Effects of Bagging and Bias Correction on Estimators Defined by Estimating Equations
Authors
Abstract
Bagging an estimator approximately doubles its bias through the impact of bagging on quadratic terms in expansions of the estimator. This difficulty can, however, be alleviated by bagging a suitably bias-corrected estimator. In these and other circumstances, what is the overall impact of bagging and/or bias correction, and how can it be characterised? We answer these questions in the case of general estimators defined by estimating equations, including, for example, maximum likelihood and method of moments estimators. It is shown that, despite the considerable variety of estimators that can be constructed by bagging and bias correction, the number of modes of behaviour is very small. In particular, bagging a bias-corrected estimator produces a new estimator that is second-order equivalent to the original, unadjusted estimator. Furthermore, the conventional bagged estimator and the standard bias-corrected estimator represent virtually equal but opposite adjustments of the conventional estimator. That is, bagging adds back the adjustment provided by bias correction. If we bag a doubly bias-corrected estimator, constructed so as to counteract the tendency of bagging to exacerbate bias, then the result is an estimator that is second-order equivalent to the standard bias-corrected estimator. These results do not depend on the manner of bias correction; that procedure may be implemented using the jackknife, the parametric bootstrap or the nonparametric bootstrap. They show that, when bagging is applied to relatively conventional statistical problems, it cannot reliably be expected to improve performance. Its domain is, in effect, restricted to problems such as regression trees, where variability is so high that it cannot be plausibly modelled using the approach taken here.
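The "equal but opposite adjustments" claim can be checked on a toy case. The sketch below is an illustration only and is not taken from the paper: it assumes the plug-in estimator of mu^2 (the squared sample mean, whose bias is sigma^2/n), and it replaces Monte Carlo bagging with "ideal" bagging, that is, an exact average over all n^n resamples, which is only feasible for very small n. The function names and the data are hypothetical.

```python
import itertools


def squared_mean(xs):
    """Plug-in estimator of mu^2: the square of the sample mean."""
    m = sum(xs) / len(xs)
    return m * m


def bag(estimator, xs):
    """Ideal bagging: average `estimator` over all n**n equally likely
    resamples of size n drawn with replacement from xs."""
    n = len(xs)
    values = [estimator(r) for r in itertools.product(xs, repeat=n)]
    return sum(values) / len(values)


def bias_corrected(xs):
    """Nonparametric bootstrap bias correction:
    theta_hat - (E*[theta_hat*] - theta_hat) = 2*theta_hat - bagged value."""
    return 2 * squared_mean(xs) - bag(squared_mean, xs)


xs = (1.0, 2.0, 4.0, 7.0)  # toy data, chosen arbitrarily (assumption)
n = len(xs)
mean = sum(xs) / n
s2 = sum((x - mean) ** 2 for x in xs) / n  # sample variance with divisor n

plain = squared_mean(xs)
print("bagged - plain                 :", bag(squared_mean, xs) - plain)
print("bias-corrected - plain         :", bias_corrected(xs) - plain)
print("bagged(bias-corrected) - plain :", bag(bias_corrected, xs) - plain)
print("s2/n and s2/n^2                :", s2 / n, s2 / n ** 2)
```

For this estimator the three differences are, up to floating-point rounding, s2/n, -s2/n and s2/n^2. Bagging adds roughly the bias back a second time, bias correction subtracts the same amount, and bagging the bias-corrected estimator returns to the unadjusted estimator up to a term of order 1/n^2.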
Similar resources
Empirical Bayes Estimators with Uncertainty Measures for NEF-QVF Populations
The paper proposes empirical Bayes (EB) estimators for simultaneous estimation of means in the natural exponential family (NEF) with quadratic variance functions (QVF) models. Morris (1982, 1983a) characterized the NEF-QVF distributions which include among others the binomial, Poisson and normal distributions. In addition to the EB estimators, we provide approximations to the MSE’s of t...
Finite sample adjustments in estimating equations and covariance estimators for intracluster correlations.
Bias-corrected covariance estimators are introduced in the context of an estimating equations approach for intracluster correlations among binary outcomes. Simulation study results show that the bias-corrected covariance estimators perform better than uncorrected sandwich estimators in terms of bias and coverage probabilities. Additionally, introduction of a matrix-based bias-correction into th...
Estimation of Parameters for an Extended Generalized Half Logistic Distribution Based on Complete and Censored Data
This paper considers an Extended Generalized Half Logistic distribution. We derive some properties of this distribution and then we discuss estimation of the distribution parameters by the methods of moments, maximum likelihood and the new method of minimum spacing distance estimator based on complete data. Also, maximum likelihood equations for estimating the parameters based on Type-I and Typ...
Robust estimating equations and bias correction of correlation parameters for longitudinal data
The estimation of correlation parameters has received attention for both its own interest and improvement of the estimation efficiency of mean parameters by the generalized estimating equations (GEE) approach. Many of the well-established methods for the estimation of correlation parameters can be constructed under the GEE framework which is, however, sensitive to outliers. In this paper, we co...
Estimation in Simple Step-Stress Model for the Marshall-Olkin Generalized Exponential Distribution under Type-I Censoring
This paper considers the simple step-stress model from the Marshall-Olkin generalized exponential distribution when there is time constraint on the duration of the experiment. The maximum likelihood equations for estimating the parameters assuming a cumulative exposure model with lifetimes as the distributed Marshall Olkin generalized exponential are derived. The likelihood equations do not lea...